refactor(BA-5713): collapse scheduler signatures to owner_id#11047
Closed
jopemachine wants to merge 7 commits into
Closed
refactor(BA-5713): collapse scheduler signatures to owner_id#11047jopemachine wants to merge 7 commits into
jopemachine wants to merge 7 commits into
Conversation
This was referenced Apr 14, 2026
719ff4b to
f811d03
Compare
3b7a216 to
18a8ec2
Compare
jopemachine
commented
Apr 14, 2026
jopemachine
commented
Apr 14, 2026
f811d03 to
c2d0d94
Compare
18a8ec2 to
17d3c3a
Compare
Contributor
There was a problem hiding this comment.
Pull request overview
This PR is part of the BA-5650 refactor stack, updating scheduler- and scheduler-repository-facing types/queries to key scheduling behavior off owner_id and the owner’s resolved main_access_key (instead of relying on sessions.access_key snapshots).
Changes:
- Add a helper in scheduler predicates to resolve
main_access_keyfromusersviaSessionRow.user_uuid, and switch keypair-policy/concurrency lookups to use that value. - Update DRF scheduler fairness tracking to be keyed by user UUID instead of
AccessKey. - Rename several scheduler repository DTO fields from
access_key→main_access_keyanduser_uuid→owner_id, and remove redundantby_access_key_*condition factories from scheduler repository options. - Add “access_key → owner UUID” shims in stream/events DB sources.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
src/ai/backend/manager/scheduler/predicates.py |
Resolve owner main access key for keypair-scoped predicates (concurrency/resource-policy/pending limits). |
src/ai/backend/manager/scheduler/drf.py |
Switch DRF fairness bookkeeping to use user UUID keys. |
src/ai/backend/manager/repositories/stream/db_source/db_source.py |
Resolve owner_id from access_key before fetching streaming sessions. |
src/ai/backend/manager/repositories/events/db_source/db_source.py |
Resolve owner_id from access_key for event session lookups. |
src/ai/backend/manager/repositories/scheduler/types/session_creation.py |
Rename enqueue DTO fields to owner_id / main_access_key. |
src/ai/backend/manager/repositories/scheduler/types/session.py |
Rename scheduler session DTO fields to owner_id / main_access_key (partial). |
src/ai/backend/manager/repositories/scheduler/types/results.py |
Rename scheduled result DTO field to main_access_key. |
src/ai/backend/manager/repositories/scheduler/types/allocation.py |
Update comment to reflect main access key semantics. |
src/ai/backend/manager/repositories/scheduler/options.py |
Remove duplicated access-key string match condition factories; rename user UUID filter factory methods to owner_id naming. |
changes/11047.misc.md |
Changelog entry for the refactor slice. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
2caf306 to
c1a5f32
Compare
3ece8ba to
81cdac3
Compare
jopemachine
commented
Apr 14, 2026
jopemachine
commented
Apr 14, 2026
e2ff93b to
765d164
Compare
7ab7788 to
e10c89d
Compare
4ecaf01 to
82ea2d0
Compare
``SessionRepository`` and the underlying ``SessionDBSource`` now take ``owner_id: UUID`` on every method that previously accepted ``owner_access_key: AccessKey``. Affects: - ``get_session_validated`` - ``match_sessions`` - ``update_session_name`` - ``find_dependency_sessions`` / ``_find_dependent_sessions`` - ``get_target_session_ids`` - ``get_session_with_group`` The matching ``dependency_graph`` helpers and ``creators`` are updated in lockstep. Service-layer callers still pass ``owner_access_key`` temporarily; they will be migrated in the next slice.
Scheduler / predicates / scheduler-type collapse of the owner key: - ``scheduler/predicates.py``: predicates now take SessionRow and resolve ``main_access_key`` from the owner via a helper when a keypair-scoped lookup (Redis concurrency, keypair resource policy) is required. Renames ``SessionRow.user_uuid`` references throughout. - ``scheduler/drf.py``: per-user fairness tracking keyed by ``owner_id``/``main_access_key`` pair. - ``repositories/scheduler/options.py``: drop the duplicated ``by_access_key_*`` factories — session filters go through ``SessionConditions`` helpers instead. - ``repositories/scheduler/types/*``: rename ``access_key`` to ``main_access_key`` on ``ScheduledSessionData``, ``TerminatingSessionData``, ``SweptSessionInfo``, ``KernelEnqueueData`` and ``SessionEnqueueData``. - ``repositories/events/db_source/db_source.py`` and ``repositories/stream/db_source/db_source.py``: resolve the owner UUID from ``main_access_key`` via a sub-select shim while the schema still exposes ``sessions.access_key``.
- repositories/scheduler/types/session.py: rename PendingSessionData.access_key -> main_access_key (and drop the outdated resolved-main_access_key comment). - repositories/scheduler/db_source/db_source.py: update the PendingSessionData call site to main_access_key + owner_id keyword names matching the dataclass. - scheduler/drf.py: use existing_sess.user_uuid (SessionRow stores the owner UUID there, not owner_id). - scheduler/predicates.py: guard every _resolve_main_access_key consumer (check_concurrency, check_keypair_resource_limit, check_pending_session_count_limit, check_pending_session_resource_limit) with an early main_ak-is-None return so that NULL main_access_key users don't fall through to keypair policy lookups that match with NULL.
- repositories/scheduler/types/allocation.py: rename SessionAllocation's ``access_key`` field to ``main_access_key`` and drop the stale explanatory comment. - repositories/stream/db_source/db_source.py: raise UserNotFound when no user owns the supplied ``main_access_key`` — SessionNotFound was misleading since the failure is about the user lookup, not the session. - Drop the leftover ``changes/BA-5650-D.misc.md`` file.
- Scheduler db_source and preparer now use ``main_access_key`` / ``owner_id`` on ``SessionEnqueueData``, ``KernelEnqueueData``, ``TerminatingSessionData``, ``SweptSessionInfo``, and ``ScheduledSessionData``. - ``PendingSessions.owner_ids`` replaces ``user_uuids`` in db_source call sites. - ``scheduling_controller`` reads ``session_spec.access_key`` (SessionCreationSpec still uses this name pre-sokovan rename). - ``PendingSessionData.to_session_workload`` maps ``main_access_key`` → ``SessionWorkload.access_key`` and ``owner_id`` → ``SessionWorkload.user_uuid`` so the bridge works until slice F renames the sokovan SessionWorkload fields. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
e10c89d to
b3067da
Compare
Member
Author
|
Consolidated into the merged PR stack (A+B+C+D, E+F, G+H, I) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
📚 Stack
mainMerge from top to bottom. Intermediate slices may not build standalone; the final refactor tip is #11051, and #11040 is the schema-drop follow-up on top of it.
Summary
scheduler/predicates.py: predicates resolve the owner'smain_access_keyvia helper; references toSessionRow.user_uuidthroughout.scheduler/drf.py: fairness keyed by(owner_id, main_access_key).repositories/scheduler/options.py: drop duplicatedby_access_key_*factories.repositories/scheduler/types/*: renameaccess_key→main_access_keyonScheduledSessionData,TerminatingSessionData,SweptSessionInfo,KernelEnqueueData,SessionEnqueueData.events/stream db_source: owner-UUID sub-select shim.Resolves BA-5713. Part of epic BA-5650.
BA-5650 Series: Split Rationale
Overall goal: migrate the session owner identifier from
access_key(keypair) toowner_id(user UUID), and drop thesessions.access_key/kernels.access_keycolumns.Split criteria: layer + dependency order. Bottom-up (DB helpers → service → API) so the destructive column drop can land safely at the end.
get_main_access_key_by_idand related resolver helpers (everything else depends on this)UserPermission.user_uuid → owner_id; addmain_access_keyfieldSessionData.user_uuid → owner_id; Row adapters; GQL nodeSessionRepository,SessionDBSource, creators signaturesowner_idfrom 21 read/control Actions; resolve viacurrent_user()owner_access_keyfrom REST v1 DTOs (breaking)Why this split
access_keyfromowner_id. BA-5653 (destructive) only runs once every reader has migrated off the dropped column.